Transcribing broadcast data using MLP features
Abstract
This paper describes the incorporation of discriminative features from a multi-layer perceptron (MLP) into a state-of-the-art Arabic broadcast data transcription system based on cepstral features. The MLP features are based on a recently proposed Bottle-Neck architecture with a long-term warped LPTRAP speech representation at the input. It is shown that the previously reported improvements on a development Arabic transcription system carry through to a full state-of-the-art system. SAT, CMLLR and MLLR adaptation techniques are shown to be useful for both MLP and combined features, though to a lesser degree than for PLP features. Without adaptation, MLP features outperform cepstral features in all test conditions; with adaptation, both feature sets give comparable results. Combining the features, either by feature concatenation or by combining system hypotheses, gives significant gains. Gains from MMI model training appear to be additive to the gains from the discriminative MLP features.
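The feature-concatenation idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a tiny, untrained network with random weights; the layer sizes, the helper names (`make_layer`, `dense`, `bottleneck_features`, `concatenate`) and the random input data are all hypothetical and are not taken from the paper. It only shows the data flow: a long-term input vector is passed through the network up to the narrow bottleneck layer, and the bottleneck activations are appended frame by frame to the cepstral (PLP) features.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dense(x, layer, activation=None):
    """Apply a dense layer to vector x, with an optional element-wise activation."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    if activation is not None:
        out = [activation(v) for v in out]
    return out


def bottleneck_features(x, hidden_layer, bottleneck_layer):
    """Forward x up to the bottleneck layer and return its linear outputs.

    Layers after the bottleneck (used only during training) are omitted here.
    """
    hidden = dense(x, hidden_layer, sigmoid)
    return dense(hidden, bottleneck_layer)


def concatenate(plp_frames, mlp_frames):
    """Append the MLP feature vector to the PLP feature vector of each frame."""
    if len(plp_frames) != len(mlp_frames):
        raise ValueError("feature streams must have the same number of frames")
    return [plp + mlp for plp, mlp in zip(plp_frames, mlp_frames)]


# Hypothetical dimensions, chosen only to keep the example small.
INPUT_DIM, HIDDEN_DIM, BOTTLENECK_DIM, PLP_DIM, N_FRAMES = 50, 20, 5, 39, 4

rng = random.Random(0)
hidden_layer = make_layer(INPUT_DIM, HIDDEN_DIM, rng)
bottleneck_layer = make_layer(HIDDEN_DIM, BOTTLENECK_DIM, rng)

# Random stand-ins for the long-term MLP inputs and the PLP features.
mlp_inputs = [[rng.gauss(0, 1) for _ in range(INPUT_DIM)] for _ in range(N_FRAMES)]
plp_frames = [[rng.gauss(0, 1) for _ in range(PLP_DIM)] for _ in range(N_FRAMES)]

mlp_frames = [bottleneck_features(x, hidden_layer, bottleneck_layer) for x in mlp_inputs]
combined = concatenate(plp_frames, mlp_frames)
print(len(combined), len(combined[0]))
```

In a real system the network would first be trained on phone targets, and the concatenated vectors would then feed the HMM acoustic models; both steps are outside the scope of this sketch.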
Similar resources
Efficient generation and use of MLP features for Arabic speech recognition
Front-end features computed using Multi-Layer Perceptrons (MLPs) have recently attracted much interest, but are a challenge to scale to large networks and very large training data sets. This paper discusses methods to reduce the training time for the generation of MLP features and their use in an ASR system using a variety of techniques: parallel training of a set of MLPs on different data sub-...
The efficient incorporation of MLP features into automatic speech recognition systems
In recent years, the use of Multi-Layer Perceptron (MLP) derived acoustic features has become increasingly popular in automatic speech recognition systems. These features are typically used in combination with standard short-term spectral-based features, and have been found to yield consistent performance improvements. However there are a number of design decisions and issues associated with th...
Transcribing broadcast news with the 1997 Abbot System
Recent DARPA CSR evaluations have focused on the transcription of broadcast news from both television and radio programmes [17]. This is a challenging task because the data includes a variety of speaking styles and channel conditions. This paper describes the development of a connectionist-hidden Markov model (HMM) system, and the enhancements designed to improve performance on broadcast news d...
On the Use of MLP Features for Broadcast News Transcription
Multi-Layer Perceptron (MLP) features have recently been attracting growing interest for automatic speech recognition due to their complementarity with cepstral features. In this paper the use of MLP features is evaluated in a large vocabulary continuous speech recognition task, exploring different types of MLP features and their combination. Cepstral features and three types of BottleNeck MLP ...
Data-driven clustered hierarchical tandem system for LVCSR
In tandem systems, the outputs of multi-layer perceptron (MLP) classifiers have been successfully used as features for HMM-based automatic speech recognition. In this paper, we propose a data-driven clustered hierarchical tandem system that yields improved performance on a large-vocabulary broadcast news transcription task. The complicated global learning for a large monolithic MLP classifier i...